Information-theoretic quantification of inherent discrimination bias in training data for supervised learning

Aldarmini, Sokrat; Nafea, Mohamed (March 2025, OpenReview.net: https://openreview.net/pdf?id=ZsqigefFO8). This work has been accepted for presentation at the "2nd Workshop on Navigating and Addressing Data Problems for Foundation Models (DATA-FM)" at ICLR 2025.

Algorithmic fairness research has mainly focused on adapting learning models to mitigate discrimination based on protected attributes, yet understanding inherent biases in training data remains largely unexplored. Quantifying these biases is crucial for informed data engineering, as data mining and model development often occur separately. We address this by developing an information-theoretic framework to quantify the marginal impacts of dataset features on the discrimination bias of downstream predictors. We postulate a set of desired properties for candidate discrimination measures and derive measures that (partially) satisfy them. Distinct sets of these properties align with distinct fairness criteria, such as demographic parity or equalized odds, which we show can be in disagreement and cannot be simultaneously satisfied by a single measure. We use the Shapley value to determine individual features’ contributions to overall discrimination, and prove its effectiveness in eliminating redundancy. We validate our measures through a comprehensive empirical study on numerous real-world and synthetic datasets. For synthetic data, we use a parametric linear structural causal model to generate diverse data correlation structures. Our analysis provides empirically validated guidelines for selecting discrimination measures based on data conditions and fairness criteria, establishing a robust framework for quantifying inherent discrimination bias in data.
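
To make the Shapley-based attribution described in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state the paper's discrimination measures, so the sketch assumes a stand-in value function, the empirical mutual information (in bits) between a subset of discrete features and the protected attribute. The names mutual_information, subset_value, and shapley_values are hypothetical, and the exact enumeration over subsets is only practical for a handful of features.

```python
from collections import Counter
from itertools import combinations
from math import factorial, log2


def mutual_information(xs, ys):
    """Empirical mutual information I(X; Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in pxy.items()
    )


def subset_value(rows, protected, subset):
    """ASSUMED value function: I(X_S; A) for the feature subset S.

    This stands in for the paper's discrimination measures, which the
    abstract does not specify.
    """
    if not subset:
        return 0.0
    joint = [tuple(row[i] for i in subset) for row in rows]
    return mutual_information(joint, protected)


def shapley_values(rows, protected):
    """Exact Shapley value of each feature under subset_value."""
    d = len(rows[0])
    phi = [0.0] * d
    for i in range(d):
        others = [j for j in range(d) if j != i]
        for k in range(d):
            # Standard Shapley weight for coalitions of size k.
            weight = factorial(k) * factorial(d - k - 1) / factorial(d)
            for coalition in combinations(others, k):
                gain = subset_value(rows, protected, coalition + (i,)) - subset_value(
                    rows, protected, coalition
                )
                phi[i] += weight * gain
    return phi


# Toy data: features 0 and 1 both copy the protected attribute A,
# feature 2 is independent noise.
protected = [0, 0, 1, 1]
rows = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
phi = shapley_values(rows, protected)
print(phi)
```

On this toy data the two duplicated features each receive half a bit and the noise feature receives zero, so the attributions sum to the one bit of information that the full feature set carries about the protected attribute. This illustrates how Shapley attribution splits credit between redundant features rather than counting it twice, under the assumed value function only.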

Free, publicly-accessible full text available March 5, 2026
